Speaker normalization based on test to reference speaker mapping
Authors
Abstract
This paper presents the speaker normalization technique we implemented in a teaching and training system for hearing-impaired children, with the goal of reducing inter-speaker variability in the time-frequency representation of speech. To reduce the variance caused by differences in vocal tract shape among speakers, we investigate a formant-based nonlinear frequency warping approach to vocal tract normalization. The proposed method can be realized efficiently in an analysis-by-synthesis framework. After the speech is decomposed into a vocal tract envelope and an excitation model, the vocal tract envelope is warped by the estimated frequency warping function, while the excitation characteristics are mapped to those of the reference speaker. The results show that, after normalization, the spectral distance between the test speaker and the reference speaker decreases significantly for correctly pronounced words, while for words the test speaker pronounces poorly the spectral distance remains relatively high.
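The warping step can be illustrated with a minimal sketch. The abstract does not give the form of the warping function, so the sketch assumes a piecewise-linear map that passes through matching formant pairs of the test and reference speakers, with the endpoints 0 Hz and the Nyquist frequency fixed. The function names `make_warp` and `warp_envelope`, the uniform bin spacing, and the linear interpolation are all illustrative assumptions, not the paper's implementation.

```python
from bisect import bisect_right


def make_warp(src_formants, dst_formants, nyquist):
    """Build a piecewise-linear frequency map (assumed form, not from the paper).

    Formant lists must be strictly increasing and lie strictly between
    0 and nyquist. The map sends each source formant to the matching
    destination formant and keeps 0 Hz and nyquist fixed.
    """
    xs = [0.0] + [float(f) for f in src_formants] + [float(nyquist)]
    ys = [0.0] + [float(f) for f in dst_formants] + [float(nyquist)]

    def warp(freq):
        freq = min(max(freq, 0.0), float(nyquist))
        # Find the segment [xs[i], xs[i + 1]] that contains freq.
        i = min(bisect_right(xs, freq) - 1, len(xs) - 2)
        t = (freq - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return warp


def warp_envelope(envelope, test_formants, ref_formants, nyquist):
    """Resample a spectral envelope so test formants land on reference formants.

    envelope holds samples on uniformly spaced bins from 0 Hz to nyquist.
    Each output bin is read from the test envelope at the inverse-warped
    frequency, using linear interpolation between neighbouring bins.
    """
    n = len(envelope)
    step = nyquist / (n - 1)
    # Inverse map: reference-speaker frequency -> test-speaker frequency.
    inverse = make_warp(ref_formants, test_formants, nyquist)
    warped = []
    for k in range(n):
        src_bin = inverse(k * step) / step
        lo = min(int(src_bin), n - 2)
        frac = src_bin - lo
        warped.append((1.0 - frac) * envelope[lo] + frac * envelope[lo + 1])
    return warped
```

For example, with a test-speaker formant at 500 Hz and a reference formant at 600 Hz, `warp_envelope` moves an envelope peak from the 500 Hz bin to the 600 Hz bin. Mapping the excitation to the reference speaker, the other half of the method, is not covered by this sketch.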
Similar resources
Improved Speaker Markov Modelling for Unsupervised Speaker Normalization
We propose new methods for improving speech recognition using speaker-variable information. Hidden Markov Model-based recognizers trained on reference speaker(s) (RS) are normalized by two different approaches to give a better speaker-independent recognition rate. Both normalization methods are based on the same principle of inter-speaker Markov mapping. This mapping gives inter-speak...
Acrobat Distiller, Job 2
The aim of the work described in this paper is to develop and evaluate the speaker normalization technique based on the test to reference speaker mapping. The method is suitable for uniform time-frequency representation of speech used in speech corrector systems. The normalized spectrum is generated after the analysis by synthesis for the given utterance using the MBE (multiband excitation) cod...
A new cohort normalization using local acoustic information for speaker verification
This paper describes a new cohort normalization method for HMM based speaker verification. In the proposed method, cohort models are synthesized based on the similarity of local acoustic features between speakers. The similarity can be determined using acoustic information lying in model components such as phonemes, states, and the Gaussian distributions of HMMs. With the method, the synthesize...
Modeling speech imitation and ecological learning of auditory-motor maps
Classical models of speech consider an antero-posterior distinction between perceptive and productive functions. However, the selective alteration of neural activity in speech motor centers, via transcranial magnetic stimulation, was shown to affect speech discrimination. On the automatic speech recognition (ASR) side, the recognition systems have classically relied solely on acoustic data, ach...
Frame level likelihood normalization for text-independent speaker identification using Gaussian mixture models
In this paper we propose a new speaker identification system that introduces the likelihood normalization technique widely used for speaker verification. In the new system, which is based on Gaussian Mixture Models, every frame of the test utterance is input to all the reference models in parallel. In this procedure, for each frame, likelihoods from all the models are available, hence th...